METHODS FOR THE DIAGNOSIS
AND TREATMENT OF LUNG CANCER 

CROSS-REFERENCE TO RELATED APPLICATION 

This application claims priority under 35 U.S.C. § 119(e) to U.S.
Provisional Application No. 60/080,044, filed March 31, 1998, the contents of which
are hereby incorporated by reference into the present disclosure.

TECHNICAL FIELD 

This invention is in the field of cancer biology. In particular, the
present invention provides compositions and methods for diagnosing and treating a
neoplastic lung cell characterized by overexpression of a proto-oncogene.

BACKGROUND 

Despite numerous advances in medical research, cancer remains the 
second leading cause of death in the United States. In the nations, roughly one in five 
persons will die of cancer. Traditional modes of clinical care, such as surgical
resection, radiotherapy and chemotherapy, have a significant failure rate, especially
for solid tumors. Failure occurs either because the initial tumor is unresponsive, or 
because of recurrence due to regrowth at the original site and/or metastases. 

Lung cancer is one of the most common malignancies worldwide and 
is the second leading cause of cancer death in man. See American Cancer Society,
Cancer Facts and Figures, 1996, Atlanta. Approximately 178,100 new cases of lung
cancer were expected to be diagnosed in 1997, accounting for 13% of cancer diagnoses. An
estimated 160,400 deaths due to lung cancer were expected to occur in 1997, accounting for 29%
of all cancer deaths. The 1-year survival rates for lung cancer have increased from 32%
in 1973 to 41% in 1993, largely due to improvements in surgical techniques. The 5-year
survival rate for all stages combined is only 14%. The survival rate is 48% for
cases detected when the disease is still localized, but only 15% of lung cancers are 
discovered that early. Among various forms of lung cancer, non-small cell lung
cancer (NSCLC) accounts for nearly 80% of all new lung cancer cases each year. For
patients diagnosed with NSCLC, surgical resection offers the only chance of
meaningful survival. On the other hand, small cell lung cancer is the most malignant
and fastest growing form of lung cancer and accounts for the remaining approximately
20% of new cases of lung cancer. The primary tumor is generally responsive to
chemotherapy, but the response is followed by widespread metastasis. The median survival time
at diagnosis is approximately 1 year, with a 5-year survival rate of 5%.

In spite of major advances in cancer therapy, including improvements
in surgical resection, radiation treatment and chemotherapy, successful intervention,
for lung cancer in particular, relies on early detection of the cancerous cells.
Neoplasia resulting in benign tumors may be completely cured by removing the mass
surgically. If a tumor becomes malignant, as manifested by invasion of surrounding
tissue, it becomes much more difficult to eradicate. Therefore, there remains a considerable
need in the art for the development of methods for detecting the disease at an early
stage. There also exists a pressing need in the art for developing diagnostic methods to
monitor or prognose the progression of the disease as well as methods to treat various
conditions. However, the vast variability in the nature of the disease has rendered the
search for cellular markers, such as genes that are preferably overexpressed in
primary lung cancer cells and useful for diagnostic and therapeutic methods, difficult.

Tumors often result from genetic alterations occurring spontaneously,
from viral infection, or in response to chemical carcinogens or radiation. Genes
responsible for transforming a normal cell to a cancer (or neoplastic) cell are known
as oncogenes. With the advent of recombinant DNA technology, a large number of
oncogenes have been identified, cloned and sequenced. In most cases, the identified
oncogenes are in fact altered forms of native cellular genes, known as
proto-oncogenes. Related techniques have revealed that cell transformation can
also be caused by overproduction of certain normal gene products. Overexpression of
a proto-oncogene may result from an amplification of the gene copies, or a
chromosomal rearrangement that has brought the proto-oncogene under the control of
an inappropriate regulatory element, such as a constitutively activated promoter. The
present invention provides the first identification of proto-oncogenes that are
preferably overexpressed in primary lung cancer cells. In addition, the methods
described herein provide a significant contribution to the area of lung cancer
diagnosis, monitoring and treatment.
DISCLOSURE OF THE INVENTION

The present invention provides methods for aiding in the diagnosis of
the neoplastic condition of a lung cell, and methods of screening for a potential
therapeutic agent for the reversal of the neoplastic condition.

Accordingly, one embodiment of this invention is a method of
diagnosing the neoplastic condition of a lung cell by screening for the presence of an
overexpressed proto-oncogene from a lung cell sample, in which the overexpression is
indicative of the neoplastic state of the lung cell. In one aspect of this embodiment,
the overexpressed proto-oncogene is b-myb or p67, and in another aspect, the
overexpressed proto-oncogene is PGP9.5 or 8-oxo-dGTPase. The overexpression of
the proto-oncogene embodied in the present invention is determined by detecting the
quantity of mRNA transcribed from the proto-oncogene, or the quantity of cDNA
produced from the reverse transcription of the mRNA, or the quantity of the
polypeptide or protein encoded by the proto-oncogene. The methods are particularly
useful for aiding in the diagnosis of non-small cell lung cancer.

Another embodiment of the invention is a screen for a potential
therapeutic agent for the reversal of the neoplastic condition of a lung cell, wherein
the cell is characterized by overexpression of a proto-oncogene. The method
comprises contacting the cell with an effective amount of a potential agent and
assaying for reversal of the neoplastic condition. In one aspect of this embodiment,
the proto-oncogene is one or more of any of b-myb, p67, PGP9.5 or 8-oxo-dGTPase.

Yet another embodiment of the present invention is a method of
reversing the neoplastic condition of a lung cell, wherein the cell is characterized by
overexpression of a proto-oncogene, by contacting the cell with an agent identified by
the above-mentioned method. The identified agent can be, but is not limited to,
anti-sense RNA that specifically inhibits the overexpression of the proto-oncogene. The
overexpressed proto-oncogene is any of b-myb, p67, 8-oxo-dGTPase or PGP9.5.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the relative expression of all unique tags in lung tumor
and normal tissues. The ratio of tags present in the tumor and normal control was
determined using the sum of the tags from each of the two libraries. In the case of a
zero appearance, a number of 1 was inserted to allow the ratio to become a
meaningful value. The x-axis indicates the fold induction in either system. A positive
value indicates fold induction in the normal and a negative value indicates fold
induction in the tumor. The y-axis indicates the number of genes differentially
expressed where y - 10".

Figure 2 depicts Northern blot analysis of genes identified by SAGE.
Total RNAs were isolated from normal lung, small bowel, liver, and a panel of 9 lung
tumor cell lines as indicated. All cell lines were obtained from ATCC and were previously
described in Gazdar et al. (1996) J. Cell Biochem. Suppl. 24:1-11; and Phelps et al.
(1996) J. Cell Biochem. Suppl. 24:32-91.

Figure 3A depicts the detection of the human b-myb and PGP9.5
transcripts in primary lung cancers. Upper panel: cases L002, LOOS, L010, L012, 84,
85, 86, 88, 90, 91 have b-myb transcripts in their lung tumors. Lower panel: cases
L002, LOOS, L012, 84, 85, 86, 88, 91 have PGP9.5 transcripts in their lung tumors.
Figure 3B depicts the expression pattern of the human PGP9.5 transcript in lung cancer
cell lines of different ASH1 status. PGP9.5 transcript was not detectable in either
normal lung or an embryonic lung cell line (L132) but was expressed in
lung tumor cell lines with (H82, H125, H157, H520, H1299, H358) and without
(H249, H1155, DMS53, H727, H146, H1770) neuroendocrine features. H358 has
lower expression compared to other cell lines. The bottom panel depicts a Western
blot using anti-PGP9.5 antibody for detecting PGP9.5 protein expressed in lung tumor
cell lines. PGP9.5 was expressed in all cell lines except H358.
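
The tag-ratio convention described for Figure 1 can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function name `fold_induction`, the per-library count lists, and the handling of equal counts (reported as +1.0) are hypothetical and are not taken from the disclosure, which specifies only that tags are summed over the libraries, that a zero count is replaced by 1, and that positive values denote induction in normal tissue while negative values denote induction in tumor.

```python
def fold_induction(tumor_counts, normal_counts):
    """Signed fold induction for a single tag.

    tumor_counts / normal_counts: tag counts from each library
    (hypothetical inputs; they are summed before the ratio is taken).
    Positive result: higher in normal. Negative result: higher in tumor.
    """
    # A zero appearance is replaced by 1 so that the ratio is defined.
    tumor = sum(tumor_counts) or 1
    normal = sum(normal_counts) or 1
    if normal >= tumor:
        # Assumption: equal counts are reported as +1.0.
        return normal / tumor
    return -(tumor / normal)


if __name__ == "__main__":
    print(fold_induction([0, 0], [5, 5]))   # tag seen only in normal
    print(fold_induction([12, 8], [1, 1]))  # tag enriched in tumor
```

Binning the resulting values and counting tags per bin would give a histogram of the kind the figure legend describes.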

MODES FOR CARRYING OUT THE INVENTION 

Throughout this disclosure, various publications, patents and published
patent specifications are referenced by an identifying citation. The disclosures of
these publications, patents and published patent specifications are hereby incorporated
by reference into the present disclosure to more fully describe the state of the art to
which this invention pertains.

Definitions 

The practice of the present invention will employ, unless otherwise
indicated, conventional techniques of immunology, molecular biology, microbiology,
cell biology and recombinant DNA, which are within the skill of the art. See, e.g.,
Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd
edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel, et al.
eds., (1987)); the series METHODS IN ENZYMOLOGY (Academic Press, Inc.); PCR 2: A
PRACTICAL APPROACH (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995));
Harlow and Lane, eds. (1988) ANTIBODIES, A LABORATORY MANUAL; and ANIMAL
CELL CULTURE (R.I. Freshney, ed. (1987)).

As used in the specification and claims, the singular forms "a", "an" and
"the" include plural references unless the context clearly dictates otherwise. For
example, the term "a cell" includes a plurality of cells, including mixtures thereof.

As used herein, the term "comprising" is intended to mean that the
compositions and methods include the recited elements, but do not exclude others.
"Consisting essentially of", when used to define compositions and methods, shall
mean excluding other elements of any essential significance to the combination.
Thus, a composition consisting essentially of the elements as defined herein would
not exclude trace contaminants from the isolation and purification method and
pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives,
and the like. "Consisting of" shall mean excluding more than trace elements of other
ingredients and substantial method steps for administering the compositions of this
invention. Embodiments defined by each of these transition terms are within the
scope of this invention.

The term "polypeptide" is used in its broadest sense to refer to a
compound of two or more subunit amino acids, amino acid analogs, or
peptidomimetics. The subunits may be linked by peptide bonds. In another
embodiment, the subunits may be linked by other bonds, e.g., ester, ether, etc. As used
herein the term "amino acid" refers to either natural and/or unnatural or synthetic
amino acids, including glycine and both the D or L optical isomers, and amino acid
analogs and peptidomimetics. A peptide of three or more amino acids is commonly
called an oligopeptide if the peptide chain is short. If the peptide chain is long, the
peptide is commonly called a polypeptide or a protein.

The term "isolated" means separated from constituents, cellular and
otherwise, with which the polynucleotide, peptide, polypeptide, protein, antibody, or
fragments thereof, are normally associated in nature. In one aspect of this
invention, an isolated polynucleotide is separated from the 3' and 5' contiguous
nucleotides with which it is normally associated in its native or natural
environment, e.g., on the chromosome. As is apparent to those of skill in the art, a
non-naturally occurring polynucleotide, peptide, polypeptide, protein, antibody, or
fragments thereof, does not require "isolation" to distinguish it from its naturally
occurring counterpart. In addition, a "concentrated", "separated" or "diluted"
polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, is
distinguishable from its naturally occurring counterpart in that the concentration or
number of molecules per volume is greater ("concentrated") or less
("separated") than that of its naturally occurring counterpart. A polynucleotide,
peptide, polypeptide, protein, antibody, or fragments thereof, which differs from the
naturally occurring counterpart in its primary sequence or, for example, by its
glycosylation pattern, need not be present in its isolated form since it is
distinguishable from its naturally occurring counterpart by its primary sequence, or
alternatively, by another characteristic such as glycosylation pattern. Although not
explicitly stated for each of the inventions disclosed herein, it is to be understood that
all of the above embodiments for each of the compositions disclosed below and under
the appropriate conditions, are provided by this invention. Thus, a non-naturally
occurring polynucleotide is provided as a separate embodiment from the isolated
naturally occurring polynucleotide. A protein produced in a bacterial cell is provided
as a separate embodiment from the naturally occurring protein isolated from a
eukaryotic cell in which it is produced in nature.

The terms "polynucleotide" and "oligonucleotide" are used
interchangeably, and refer to a polymeric form of nucleotides of any length, either
deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may
have any three-dimensional structure, and may perform any function, known or
unknown. The following are non-limiting examples of polynucleotides: a gene or
gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal
RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides,
plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence,
nucleic acid probes, and primers. A polynucleotide may comprise modified
nucleotides, such as methylated nucleotides and nucleotide analogs. If present,
modifications to the nucleotide structure may be imparted before or after assembly of
the polymer. The sequence of nucleotides may be interrupted by non-nucleotide
components. A polynucleotide may be further modified after polymerization, such as
by conjugation with a labeling component. The term also refers to both double- and
single-stranded molecules. Unless otherwise specified or required, any embodiment
of this invention that is a polynucleotide encompasses both the double-stranded form
and each of two complementary single-stranded forms known or predicted to make up
the double-stranded form.

A polynucleotide is composed of a specific sequence of four nucleotide
bases: adenine (A); cytosine (C); guanine (G); thymine (T); and uracil (U) for
thymine (T) when the polynucleotide is RNA. Thus, the term "polynucleotide
sequence" is the alphabetical representation of a polynucleotide molecule. This
alphabetical representation can be input into databases in a computer having a central
processing unit and used for bioinformatics applications such as functional genomics
and homology searching.

A "gene" refers to a polynucleotide containing at least one open 
reading frame that is capable of encoding a particular polypeptide or protein after
being transcribed and translated. Any of the polynucleotide sequences described
herein may be used to identify larger fragments or full-length coding sequences of the 
gene with which they are associated. Methods of isolating larger fragment sequences 
are known to those of skill in the art, some of which are described herein. 

An "oncogene" refers to a polynucleotide containing at
least one open reading frame that is capable of transforming a normal cell to a
cancerous (or neoplastic or tumor) cell when introduced into a host cell. Oncogenes
are often altered forms of the cellular counterpart, namely the "proto-oncogenes" that 
are incapable of cell transformation when expressed at the level present in a non- 
cancer cell. 

A "gene product" refers to the amino acid (e.g., peptide or polypeptide) 
generated when a gene is transcribed and translated. 

As used herein a second polynucleotide "corresponds to" another (a 
first) polynucleotide if it is related to the first polynucleotide by any of the following
relationships: 

1) The second polynucleotide comprises the first polynucleotide 
and the second polynucleotide encodes a gene product. 
2) The second polynucleotide is 5' or 3' to the first polynucleotide
in cDNA, RNA, genomic DNA, or a fragment of any of these
polynucleotides. For example, a second polynucleotide may be
a fragment of a gene that includes the first and second
polynucleotides. The first and second polynucleotides are 
related in that they are components of the gene coding for a 
gene product, such as a protein. However, it is not necessary 
that the second polynucleotide comprises or overlaps with the 
first polynucleotide to be "corresponding to" as used herein. 
For example, the first polynucleotide may be a fragment of a 3' 
untranslated region of the second polynucleotide. The first and 
second polynucleotide may be fragments of a gene coding for a 
gene product. The second polynucleotide may be an exon of 
the gene while the first polynucleotide may be an intron of the 
gene. 

3) The second polynucleotide is the complement of the first 
polynucleotide. 

A "probe" when used in the context of polynucleotide manipulation
refers to an oligonucleotide that is provided as a reagent to detect a target potentially
present in a sample of interest by hybridizing with the target. Usually, a probe will
comprise a label or a means by which a label can be attached, either before or
subsequent to the hybridization reaction. Suitable labels include, but are not limited
to, radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and proteins,
including enzymes.

A "primer" is a short polynucleotide, generally with a free 3'-OH
group, that binds to a target or "template" potentially present in a sample of interest by
hybridizing with the target, and thereafter promotes polymerization of a
polynucleotide complementary to the target. A "polymerase chain reaction" ("PCR")
is a reaction in which replicate copies are made of a target polynucleotide using a
"pair of primers" or a "set of primers" consisting of an "upstream" and a
"downstream" primer, and a catalyst of polymerization, such as a DNA polymerase,
and typically a thermally-stable polymerase enzyme. Methods for PCR are well
known in the art, and taught, for example, in "PCR: A practical approach" (M.
MacPherson et al., IRL Press at Oxford University Press (1991)). All processes of
producing replicate copies of a polynucleotide, such as PCR or gene cloning, are
collectively referred to herein as "replication." A primer can also be used as a probe
in hybridization reactions, such as Southern or Northern blot analyses. Sambrook
et al., supra.

A "sequence tag" or "SAGE tag" is a short sequence, generally under 
about 20 nucleotides, that occurs in a certain position in messenger RNA. The tag can 
be used to identify the corresponding transcript and gene from which it was 
transcribed. A "ditag" is a dimer of two sequence tags. 

An expression "database" denotes a set of stored data that represent a 
collection of sequences, which in turn represent a collection of biological reference 
materials. 

The term "cDNAs" refers to complementary DNA, that is, mRNA
molecules present in a cell or organism made into cDNA with an enzyme such as
reverse transcriptase. A "cDNA library" is a collection of all of the mRNA molecules
present in a cell or organism, all turned into cDNA molecules with the enzyme 
reverse transcriptase, then inserted into "vectors" (other DNA molecules that can 
continue to replicate after addition of foreign DNA). Exemplary vectors for libraries 
include bacteriophage (also known as "phage"), viruses that infect bacteria, for 
example, lambda phage. The library can then be probed for the specific cDNA (and 
thus mRNA) of interest. 

"Differentially expressed" as applied to a gene, refers to the 
differential production of the mRNA transcribed from the gene or the protein product 
encoded by the gene. A differentially expressed gene may be overexpressed or 
underexpressed as compared to the expression level of a normal or control cell. In 
one aspect, it refers to a differential that is 2.5 times, preferably 5 times, or more preferably
10 times higher or lower than the expression level detected in a control sample. The
term "differentially expressed" also refers to nucleotide sequences in a cell or tissue 
which are expressed where silent in a control cell or not expressed where expressed in 
a control cell. 
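
The fold-difference criterion in this definition can be sketched in Python as follows. This is a minimal illustration rather than a prescribed procedure: the function name `classify_expression`, the assumption that expression levels arrive as non-negative numbers in the same units, and the reading of "2.5 times" as "at least 2.5 times" are all assumptions made for the example.

```python
def classify_expression(sample_level, control_level, threshold=2.5):
    """Classify a gene as 'overexpressed', 'underexpressed' or 'unchanged'.

    threshold: fold difference required (2.5 by default; 5 or 10 may be
    passed for the stricter, preferred cut-offs).
    """
    if sample_level < 0 or control_level < 0:
        raise ValueError("expression levels must be non-negative")
    if sample_level == 0 and control_level == 0:
        return "unchanged"
    if control_level == 0:
        # Expressed in the sample where silent in the control.
        return "overexpressed"
    if sample_level == 0:
        # Not expressed in the sample where expressed in the control.
        return "underexpressed"
    ratio = sample_level / control_level
    if ratio >= threshold:
        return "overexpressed"
    if ratio <= 1 / threshold:
        return "underexpressed"
    return "unchanged"


if __name__ == "__main__":
    print(classify_expression(25, 10))        # 2.5-fold higher
    print(classify_expression(30, 10, 5))     # below the 5-fold cut-off
```

Passing `threshold=5` or `threshold=10` applies the stricter cut-offs named in the definition.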

The term "anti-sense RNA" or "asRNA" refers to an RNA molecule
having a nucleotide sequence that is complementary to a specific mRNA sequence.
Antisense RNA can be synthesized by methods known in the art, for example by
splicing the gene of interest in a reverse orientation to a viral promoter so that the
coding strand is transcribed (see Armentano et al. (1987) J. Virol. 61:1647-1650 for
discussion of one suitable promoter). The isolated asRNA produced by the vector can
then be introduced into an organism where it combines with naturally occurring
mRNA to form duplexes which block either further transcription or translation.
Methods of introducing asRNA into cells include, but are not limited to, non-toxic
cationic lipids as described in Chiang et al. (1991) J. Biol. Chem. 266:18162-18171.
A review of anti-sense therapy can be found in C.A. Stein (1999) Nature Biotech.
17(3):209.
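
As an illustration of the complementarity that defines anti-sense RNA, the following Python sketch derives the anti-sense sequence of a given mRNA sequence by reverse complementation. The function name `antisense_rna` and the example sequence are hypothetical; the sketch assumes both strands are written 5' to 3' using only the bases A, C, G and U.

```python
# Watson-Crick pairing for RNA bases.
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the anti-sense strand (5'->3') of an mRNA written 5'->3'."""
    mrna = mrna.upper()
    try:
        # Reverse the strand, then complement each base.
        return "".join(_RNA_COMPLEMENT[base] for base in reversed(mrna))
    except KeyError as exc:
        raise ValueError(f"not an RNA base: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(antisense_rna("AUGGCC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check on the pairing table.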

As used herein, "solid phase support" or "solid support", used
interchangeably, is not limited to a specific type of support. Rather a large number of
supports are available and are known to one of ordinary skill in the art. Solid phase
supports include silica gels, resins, derivatized plastic films, glass beads, cotton,
plastic beads, and alumina gels. As used herein, "solid support" also includes synthetic
antigen-presenting matrices, cells, and liposomes. A suitable solid phase support may
be selected on the basis of desired end use and suitability for various protocols. For
example, for peptide synthesis, solid phase support may refer to resins such as
polystyrene (e.g., phenylacetamidomethyl resin [PAM-resin] obtained from Bachem
Inc., Peninsula Laboratories, etc.), polymerized High Internal Phase
Emulsion (polyHIPE) resin (obtained from Aminotech, Canada), polyamide resin
(obtained from Peninsula Laboratories), polystyrene resin grafted with polyethylene
glycol (TENTAGEL™, Rapp Polymere, Tubingen, Germany) or
polydimethylacrylamide resin (obtained from Milligen/Biosearch, California).

A polynucleotide also can be attached to a solid support for use in high 
throughput screening assays. PCT WO 97/10365, for example, discloses the 
construction of high density oligonucleotide chips. See also, U.S. Patent Nos.
5,405,783; 5,412,087; and 5,445,934. Using this method, the probes are synthesized 
on a derivatized glass surface also known as chip arrays. Photoprotected nucleoside 
phosphoramidites are coupled to the glass surface, selectively deprotected by 
photolysis through a photolithographic mask, and reacted with a second protected 
nucleoside phosphoramidite. The coupling/deprotection process is repeated until the 
desired probe is complete. 
As used herein, "expression" refers to the process by which
polynucleotides are transcribed into mRNA and/or the process by which the
transcribed mRNA is subsequently translated into peptides, polypeptides, or
proteins. If the polynucleotide is derived from genomic DNA, expression may
include splicing of the mRNA in a eukaryotic cell. "Overexpression" as applied to a
gene, refers to the overproduction of the mRNA transcribed from the gene or the
protein product encoded by the gene, at a level that is 2.5 times higher, preferably 5
times higher, more preferably 10 times higher than the expression level detected in a
control sample.

"Hybridization" refers to a reaction in which one or more
polynucleotides react to form a complex that is stabilized via hydrogen bonding
between the bases of the nucleotide residues. The hydrogen bonding may occur by
Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific
manner. The complex may comprise two strands forming a duplex structure, three or
more strands forming a multi-stranded complex, a single self-hybridizing strand, or
any combination of these. A hybridization reaction may constitute a step in a more
extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage
of a polynucleotide by a ribozyme.

Hybridization reactions can be performed under conditions of different
"stringency". In general, a low stringency hybridization reaction is carried out at
about 40 °C in 10 x SSC or a solution of equivalent ionic strength/temperature. A
moderate stringency hybridization is typically performed at about 50 °C in 6 x SSC,
and a high stringency hybridization reaction is generally performed at about 60 °C in
1 x SSC.

When hybridization occurs in an antiparallel configuration between
two single-stranded polynucleotides, the reaction is called "annealing" and those
polynucleotides are described as "complementary". A double-stranded
polynucleotide can be "complementary" or "homologous" to another polynucleotide,
if hybridization can occur between one of the strands of the first polynucleotide and
the second. "Complementarity" or "homology" (the degree that one polynucleotide is
complementary with another) is quantifiable in terms of the proportion of bases in
opposing strands that are expected to form hydrogen bonding with each other,
according to generally accepted base-pairing rules.
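
The proportion described above can be computed directly for a simple case. The Python sketch below is illustrative only and rests on several assumptions not stated in the text: the two strands are of equal length, are both written 5' to 3', are aligned antiparallel without gaps, and only Watson-Crick pairs (A-T, A-U, G-C) are counted. The function name `complementarity` is hypothetical.

```python
# Base pairs counted as hydrogen-bonded (Watson-Crick only, by assumption).
_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
          ("A", "U"), ("U", "A")}


def complementarity(strand_a, strand_b):
    """Fraction of positions that pair when strand_b lies antiparallel to strand_a."""
    a, b = strand_a.upper(), strand_b.upper()
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    # Reversing strand_b places its 3' end opposite the 5' end of strand_a.
    paired = sum((x, y) in _PAIRS for x, y in zip(a, reversed(b)))
    return paired / len(a)


if __name__ == "__main__":
    print(complementarity("ATGC", "GCAT"))  # fully complementary strands
    print(complementarity("ATGC", "GCAA"))  # one mismatched position
```

Real hybridization also depends on strand length, gaps and reaction stringency, none of which this ungapped count attempts to model.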
When a polynucleotide or polynucleotide region (or a polypeptide or
polypeptide region) is said to have a certain percentage (for example, 80%, 85%, 90%, or 95%)
of "sequence identity" to another sequence, this means that, when aligned, that percentage
of bases (or amino acids) are the same in comparing the two sequences. This
alignment and the percent homology or sequence identity can be determined using
software programs known in the art, for example those described in CURRENT
PROTOCOLS IN MOLECULAR BIOLOGY (F.M. Ausubel et al., eds., 1987) Supplement 30,
section 7.7.18, Table 7.7.1. Preferably, default parameters are used for alignment. A
preferred alignment program is BLAST, using default parameters. In particular,
preferred programs are BLASTN and BLASTP, using the following default
parameters: Genetic code = standard; filter = none; strand = both; cutoff = 60; expect
= 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by = HIGH SCORE;
Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank CDS
translations + SwissProtein + SPupdate + PIR. Details of these programs can be
found at the following Internet address: http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST.
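
Once two sequences have been aligned, the percentage itself is a simple count. The following Python sketch shows one way to compute it; it does not perform the alignment and is not the BLAST calculation. The function name `percent_identity`, the use of "-" as the gap character, and the choice to divide by the number of alignment columns (gap columns included) are assumptions made for the example, since alignment programs differ in how they choose the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y)
               for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column matches only if the residues agree and neither is a gap.
    matches = sum(x == y and x != gap for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # one mismatch in ten
    print(percent_identity("AC-T", "ACGT"))              # one gap column in four
```

The same function applies to aligned amino acid sequences, since it compares characters without reference to a particular alphabet.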

As used herein, the terms "neoplastic cells", "neoplasia", "tumor", 
"tumor cells", "cancer" and "cancer cells", (used interchangeably) refer to cells which 
exhibit relatively autonomous growth, so that they exhibit an aberrant growth 
phenotype characterized by a significant loss of control of cell proliferation (i.e., de- 
regulated cell division). Neoplastic cells can be malignant or benign. A metastatic 
cell or tissue means that the cell can invade and destroy neighboring body structures. 

"Suppressing" tumor growth indicates a growth state that is curtailed 
when compared to growth without contact with educated, antigen-specific immune 
effector cells described herein. Tumor cell growth can be assessed by any means 
known in the art, including, but not limited to, measuring tumor size, determining 
whether tumor cells are proliferating using a ³H-thymidine incorporation assay, or
counting tumor cells. "Suppressing" tumor cell growth means any or all of the 
following states: slowing, delaying, and stopping tumor growth, as well as tumor 
shrinkage. 

Hyperplasia is a form of controlled cell proliferation involving an 
increase in cell number in a tissue or organ, without significant alteration in structure 
or function. Metaplasia is a form of controlled cell growth in which one type of fully
differentiated cell substitutes for another type of differentiated cell. Metaplasia can
occur in epithelial or connective tissue cells. Atypical metaplasia involves a
somewhat disorderly metaplastic epithelium. 

A "composition" is intended to mean a combination of active agent
and another compound or composition, inert (for example, a detectable agent or label)
or active, such as an adjuvant.

A "pharmaceutical composition" is intended to include the 
combination of an active agent with a carrier, inert or active, making the composition 
suitable for diagnostic or therapeutic use in vitro, in vivo or ex vivo. 

As used herein, the term "pharmaceutically acceptable carrier"
encompasses any of the standard pharmaceutical carriers, such as a phosphate
buffered saline solution, water, and emulsions, such as an oil/water or water/oil
emulsion, and various types of wetting agents. The compositions also can include
stabilizers and preservatives. For examples of carriers, stabilizers and adjuvants, see
Martin, REMINGTON'S PHARM. SCI., 15th Ed. (Mack Publ. Co., Easton (1975)).

An "effective amount" is an amount sufficient to effect beneficial or
desired results. An effective amount can be administered in one or more
administrations, applications or dosages.

"Subject," "individual" and "patient" are used interchangeably herein,
and refer to a vertebrate, preferably a mammal, more preferably a human.
Mammals include, but are not limited to, murines, simians, humans, farm animals,
sport animals, and pets.

A "control" is an alternative subject or sample used in an experiment
for comparison purposes. A control can be "positive" or "negative". For example,
where the purpose of the experiment is to determine a correlation of an altered
expression level of a proto-oncogene with a particular type of cancer, it is generally
preferable to use a positive control (a subject or a sample from a subject, carrying
such alteration and exhibiting syndromes characteristic of that disease), and a negative
control (a subject or a sample from a subject lacking the altered expression and
clinical syndrome of that disease).

As noted above, this invention provides various methods for aiding in
the diagnosis of the neoplastic state of a lung cell that is characterized by abnormal
cell growth in the form of, e.g., malignancy, hyperplasia or metaplasia. The
neoplastic state of a cell generally is determined by noting whether the growth of the
cell is not governed by the usual limitation of normal growth. For the purposes of this
invention, the term also is to include genotypic changes that occur prior to detection
of this growth in the form of a tumor and are causative of these phenotypic changes.
The phenotypic changes associated with the neoplastic state of a cell (a set of in vitro
characteristics associated with a tumorigenic ability in vivo) include a more rounded
cell morphology, looser substratum attachment, loss of contact inhibition, loss of
anchorage dependence, release of proteases such as plasminogen activator, increased
sugar transport, decreased serum requirement, expression of fetal antigens and the
like. (See Luria, et al. (1978) GENERAL VIROLOGY, 3rd edition, 436-446 (John
Wiley & Sons, New York).)

The cell of this invention is characterized by overexpression of a
proto-oncogene selected from the group of proto-oncogenes 8-oxo-dGTPase, b-myb, p67, or
PGP9.5. In one embodiment, overexpression is determined by assaying for the
expression of the gene in the test system and its absence in the control. In another
embodiment, overexpression is determined by an increase of 2.5 fold, preferably 5
fold, more preferably 10 fold in the level of proto-oncogene mRNA. In a separate
embodiment, an augmentation in the level of the polypeptide or protein encoded by
the proto-oncogene is indicative of the presence of the neoplastic condition of the cell.
The method can be used for aiding in the diagnosis of a lung cancer such as non-small
cell lung cancer by detecting a genotype that is correlated with a phenotype
characteristic of primary lung tumor cells. Thus, by detecting this genotype prior to
tumor growth, one can predict a predisposition to cancer or provide early diagnosis.
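
The fold-increase thresholds stated above (2.5 fold, preferably 5 fold, more preferably 10 fold) can be illustrated with a minimal Python sketch. The function name, the argument names and the returned labels are hypothetical and chosen only for illustration; the sketch assumes that mRNA levels for the test sample and the control have already been quantified in the same arbitrary units.

```python
def classify_overexpression(test_level: float, control_level: float) -> str:
    """Map a test/control mRNA ratio onto the fold-increase tiers named in the text."""
    if test_level < 0 or control_level < 0:
        raise ValueError("expression levels must be non-negative")
    if control_level == 0:
        # Expression in the test system and absence in the control.
        return "expressed in test, absent in control" if test_level > 0 else "not expressed"
    fold = test_level / control_level
    if fold >= 10:
        return "overexpressed (at least 10 fold)"
    if fold >= 5:
        return "overexpressed (at least 5 fold)"
    if fold >= 2.5:
        return "overexpressed (at least 2.5 fold)"
    return "not overexpressed"


# Illustrative values only; these are not measurements from the disclosure.
print(classify_overexpression(30.0, 10.0))
print(classify_overexpression(120.0, 10.0))
```

The tiers are checked from the most stringent to the least stringent so that each ratio is reported at the highest threshold it meets.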

Cell or tissue samples used for this invention encompass body fluid,
solid tissue samples, tissue cultures or cells derived therefrom and the progeny
thereof, and sections or smears prepared from any of these sources, or any other
samples that may contain a lung cell having the proto-oncogenes b-myb,
8-oxo-dGTPase, p67 or PGP9.5 or their gene products. A preferred sample is one that is
prepared from a subject's lung tissue.

In assaying for an alteration in mRNA level, nucleic acid contained in
the aforementioned samples is first extracted according to standard methods in the art.
For instance, mRNA can be isolated using various lytic enzymes or chemical
solutions according to the procedures set forth in Sambrook et al. (1989), supra, or
extracted by nucleic-acid-binding resins following the accompanying instructions
provided by manufacturers. The mRNA of a proto-oncogene of interest contained in
the extracted nucleic acid sample is then detected by hybridization (e.g., Northern blot
analysis) and/or amplification procedures according to methods widely known in the
art or based on the methods exemplified herein.

Nucleic acid molecules having at least 10 nucleotides and exhibiting
sequence complementarity or homology to the proto-oncogenes described herein find
utility as hybridization probes. It is known in the art that a "perfectly matched" probe
is not needed for a specific hybridization. Minor changes in probe sequence achieved
by substitution, deletion or insertion of a small number of bases do not affect the
hybridization specificity. In general, as much as 20% base-pair mismatch (when
optimally aligned) can be tolerated. Preferably, a probe useful for detecting the
aforementioned proto-oncogene mRNA is at least about 80% identical to the
homologous region of comparable size contained in the previously identified
sequences, which have the GENBANK® database accession numbers
identified in Table 2 (below). More preferably, the probe is 85% identical to the
corresponding gene sequence after alignment of the homologous region; even more
preferably, it exhibits 90% identity. Specifically, a preferred probe for b-myb is
TGCTGCCCTG (SEQ ID NO. 1), a preferred probe for PGP9.5 is CAGTCTAAAA
(SEQ ID NO. 2), a preferred probe for 8-oxo-dGTPase is TGGCCCGACG (SEQ ID
NO. 3), and a preferred probe for p67 is TAATACTTTT (SEQ ID NO. 4), or their
respective complements. Additional probes can be derived from the sequences for these
genes identified by the GENBANK® accession numbers provided in
Table 2, or from a homologous region of comparable size contained in those
sequences. These probes can be used in radioassays (e.g., Southern and
Northern blot analysis) to detect, prognose, diagnose or monitor various neoplastic
states resulting from overexpression of the PGP9.5, p67, 8-oxo-dGTPase or b-myb
genes. The total size of the fragment, as well as the size of the complementary
stretches, will depend on the intended use or application of the particular nucleic acid
segment. Smaller fragments derived from the known sequences will generally find
use in hybridization embodiments, wherein the length of the complementary region
may be varied, such as between about 10 and about 100 nucleotides, or even full
length according to the complementary sequences one wishes to detect.

Nucleotide probes having complementary sequences over stretches
greater than about 10 nucleotides in length are generally preferred, so as to increase
stability and selectivity of the hybrid, and thereby improve the specificity of
particular hybrid molecules obtained. More preferably, one can design nucleic acid
molecules having gene-complementary stretches of more than about 25 and even
more preferably more than about 50 nucleotides in length, or even longer where
desired. Such fragments may be readily prepared by, for example, directly
synthesizing the fragment by chemical means, by application of nucleic acid
reproduction technology, such as the PCR technology with two priming
oligonucleotides as described in U.S. Pat. No. 4,603,102 or by introducing selected 
sequences into recombinant vectors for recombinant production. A preferred probe is 
about 50 to about 75 or more preferably, about 50 to about 100, nucleotides in length. 
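
The identity thresholds for probes (at least about 80%, more preferably 85%, even more preferably 90%) and the use of a probe's complement can be sketched in Python as follows. This is a minimal illustration only. It assumes the probe and the target region have already been optimally aligned and are of equal length, and the helper names and tier labels are hypothetical.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(probe: str, region: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not probe or len(probe) != len(region):
        raise ValueError("sequences must be non-empty, pre-aligned and of equal length")
    matches = sum(a == b for a, b in zip(probe.upper(), region.upper()))
    return 100.0 * matches / len(probe)


def identity_tier(probe: str, region: str) -> str:
    """Report which identity tier from the text the probe satisfies."""
    pid = percent_identity(probe, region)
    if pid >= 90:
        return "even more preferred (at least 90%)"
    if pid >= 85:
        return "more preferred (at least 85%)"
    if pid >= 80:
        return "preferred (at least 80%)"
    return "below 80% identity"


B_MYB_PROBE = "TGCTGCCCTG"  # SEQ ID NO. 1
# Hypothetical target region differing from the probe at one position.
print(percent_identity(B_MYB_PROBE, "TGCTGCCCTA"))
print(reverse_complement(B_MYB_PROBE))
```

For a 10-nucleotide probe, one mismatch corresponds to 90% identity and two mismatches to 80% identity, so the 85% tier becomes meaningful only for longer probes.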

In certain embodiments, it will be advantageous to employ nucleic acid 
sequences of the present invention in combination with an appropriate means, such as 
a label, for detecting hybridization and therefore complementary sequences. A wide 
variety of appropriate indicator means are known in the art, including fluorescent, 
radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of 
giving a detectable signal. In preferred embodiments, one will likely desire to employ 
a fluorescent label or an enzyme tag, such as urease, alkaline phosphatase or 
peroxidase, instead of radioactive or other environmentally undesirable reagents. In the
case of enzyme tags, colorimetric indicator substrates are known which can be 
employed to provide a means visible to the human eye or spectrophotometrically, to 
identify specific hybridization with complementary nucleic acid-containing samples. 

Hybridization reactions can be performed under conditions of different 
"stringency". Relevant conditions include temperature, ionic strength, time of 
incubation, the presence of additional solutes in the reaction mixture such as 
formamide, and the washing procedure. Higher stringency conditions are those 
conditions, such as higher temperature and lower sodium ion concentration, which 
require higher minimum complementarity between hybridizing elements for a stable 
hybridization complex to form. Conditions that increase the stringency of a 
hybridization reaction are widely known and published in the art. See, for example,
Sambrook et al. (1989), supra.
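
As a rough illustration of how probe composition bears on the choice of hybridization temperature, the sketch below applies the Wallace rule, a common rule of thumb for short oligonucleotides that estimates the melting temperature as 2 °C per A or T plus 4 °C per G or C. The rule is not taken from this disclosure and is used here only as an assumption; it ignores salt and formamide concentration, so actual conditions are determined empirically. The function name is hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate melting temperature (degrees C) of a short oligo by the Wallace rule."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    # 2 degrees per A/T pair, 4 degrees per G/C pair.
    return 2 * at + 4 * gc


# The four preferred probes named in the text.
PROBES = {
    "b-myb": "TGCTGCCCTG",          # SEQ ID NO. 1
    "PGP9.5": "CAGTCTAAAA",         # SEQ ID NO. 2
    "8-oxo-dGTPase": "TGGCCCGACG",  # SEQ ID NO. 3
    "p67": "TAATACTTTT",            # SEQ ID NO. 4
}

for gene, seq in PROBES.items():
    print(gene, wallace_tm(seq))
```

Under this estimate the GC-rich probes tolerate a higher hybridization temperature than the AT-rich probes, which is one reason stringency is set per probe.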

Briefly, multiple RNAs are isolated from cell or tissue samples as
described above. Optionally, the gene transcripts can be converted to cDNA. A
sampling of the gene transcripts is subjected to sequence-specific analysis and
quantified. These gene transcript sequence abundances are compared against
reference database sequence abundances including normal data sets for diseased and
healthy patients. The patient has the disease(s) with which the patient's data set most
closely correlates, which includes the overexpression of the transcripts identified
herein.
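
The comparison of a patient's transcript abundances against reference data sets can be sketched as a correlation lookup. The sketch below is a minimal Python illustration, assuming Pearson correlation as the measure of "most closely correlates" (the text does not name a particular statistic). The reference profiles and patient values are invented for illustration, are in arbitrary units, and are not data from this disclosure.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length abundance profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sx * sy)


def closest_reference(patient: list[float], references: dict[str, list[float]]) -> str:
    """Name of the reference data set that correlates most closely with the patient."""
    return max(references, key=lambda name: pearson(patient, references[name]))


# Transcript order: b-myb, PGP9.5, 8-oxo-dGTPase, p67, and a hypothetical
# housekeeping transcript. All numbers are illustrative only.
REFERENCES = {
    "healthy": [1.0, 1.0, 1.0, 1.0, 10.0],
    "lung cancer": [8.0, 12.0, 6.0, 5.0, 10.0],
}
patient_profile = [7.0, 10.0, 5.0, 6.0, 9.0]
print(closest_reference(patient_profile, REFERENCES))
```

In practice a reference database would hold many profiles per condition; the single profile per condition used here only keeps the example short.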

The nucleotide probes of the present invention can also be used as
primers for the detection of genes or gene transcripts that are differentially expressed in
certain body tissues. A preferred primer for b-myb is TGCTGCCCTG (SEQ ID NO.
1), a preferred primer for PGP9.5 is CAGTCTAAAA (SEQ ID NO. 2), a
preferred primer for 8-oxo-dGTPase is TGGCCCGACG (SEQ ID NO. 3), and a
preferred primer for p67 is TAATACTTTT (SEQ ID NO. 4), or their respective complements.
Additionally, a primer useful for detecting the aforementioned proto-oncogene mRNA
is at least about 80% identical to the homologous region of comparable size contained
in the previously identified sequences, which have the GENBANK®
accession numbers identified in Table 2. For the purpose of this invention,
amplification means any method employing a primer-dependent polymerase capable
of replicating a target sequence with reasonable fidelity. Amplification may be
carried out by natural or recombinant DNA polymerases such as T7 DNA
polymerase, Klenow fragment of E. coli DNA polymerase, and reverse
transcriptase.

A preferred amplification method is PCR. General procedures for
PCR are taught in MacPherson et al., PCR: A Practical Approach (IRL Press at
Oxford University Press (1991)). However, PCR conditions used for each application
reaction are empirically determined. A number of parameters influence the success of
a reaction. Among them are annealing temperature and time, extension time, Mg2+
and ATP concentration, pH, and the relative concentration of primers, templates, and
deoxyribonucleotides.

After amplification, the resulting DNA fragments can be detected by 
agarose gel electrophoresis followed by visualization with ethidium bromide staining 
and ultraviolet illumination. A specific amplification of proto-oncogenes such as
PGP9.5, p67, 8-oxo-dGTPase or b-myb can be verified by demonstrating
that the amplified DNA fragment has the predicted size, exhibits the predicted
restriction digestion pattern, and/or hybridizes to the correct cloned DNA sequence.

The probes also can be attached to a solid support for use in high
throughput screening assays using methods known in the art. PCT WO 97/10365 and
U.S. Patent numbers 5,405,783, 5,412,087 and 5,445,934, for example, disclose the
construction of high density oligonucleotide chips which can contain one or more of
the sequences disclosed herein. Using the methods disclosed in U.S. Patent numbers
5,405,783, 5,412,087 and 5,445,934, the probes of this invention are synthesized on a
derivatized glass surface. Photoprotected nucleoside phosphoramidites are coupled to
the glass surface, selectively deprotected by photolysis through a photolithographic 
mask, and reacted with a second protected nucleoside phosphoramidite. The 
coupling/deprotection process is repeated until the desired probe is complete. 

The expression level of a proto-oncogene is determined through
exposure of a nucleic acid sample to the probe-modified chip. Extracted nucleic acid
is labeled, for example, with a fluorescent tag, preferably during an amplification step.
Hybridization of the labeled sample is performed at an appropriate stringency level.
The degree of probe-nucleic acid hybridization is quantitatively measured using a
detection device, such as a confocal microscope. See U.S. Pat. Nos. 5,578,832 and
5,631,734. The obtained measurement is directly correlated with proto-oncogene
expression level.

More specifically, the probes and high density oligonucleotide probe 
arrays provide an effective means of monitoring expression of a multiplicity of genes. 
The expression monitoring methods of this invention may be used in a wide variety of 
circumstances including detection of disease, identification of differential gene
expression between two samples, or screening for compositions that upregulate or
downregulate the expression of particular genes. 

In another preferred embodiment, the methods of this invention are 
used to monitor expression of the genes which specifically hybridize to the probes of 
this invention in response to defined stimuli, such as a drug.

In one embodiment, the hybridized nucleic acids are detected by
detecting one or more labels attached to the sample nucleic acids. The labels may be
incorporated by any of a number of means well known to those of skill in the art.
However, in one aspect, the label is simultaneously incorporated during the
amplification step in the preparation of the sample nucleic acid. Thus, for example,
polymerase chain reaction (PCR) with labeled primers or labeled nucleotides will
provide a labeled amplification product. In a separate embodiment, transcription
amplification, as described above, using a labeled nucleotide (e.g., fluorescein-labeled
UTP and/or GTP) incorporates a label into the transcribed nucleic acids.

Alternatively, a label may be added directly to the original nucleic acid
sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after
the amplification is completed. Means of attaching labels to nucleic acids are well
known to those of skill in the art and include, for example, nick translation or
end-labeling (e.g., with a labeled RNA) by kinasing of the nucleic acid and subsequent
attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label
(e.g., a fluorophore).

Detectable labels suitable for use in the present invention include any
composition detectable by spectroscopic, photochemical, biochemical,
immunochemical, electrical, optical or chemical means. Useful labels in the present
invention include biotin for staining with labeled streptavidin conjugate, magnetic
beads (e.g., DYNABEADS®), fluorescent dyes (e.g., fluorescein, Texas
Red®, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., 3H, 125I,
35S, 14C, or 32P), enzymes (e.g., horse radish peroxidase, alkaline phosphatase and
others commonly used in an enzyme linked immunosorbent assay ("ELISA")), and
colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene,
polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S.
Patents Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and
4,366,241.

Means of detecting such labels are well known to those of skill in the
art. Thus, for example, radiolabels may be detected using photographic film or
scintillation counters, and fluorescent markers may be detected using a photodetector to
detect emitted light. Enzymatic labels are typically detected by providing the enzyme
with a substrate and detecting the reaction product produced by the action of the
enzyme on the substrate, and colorimetric labels are detected by simply visualizing
the colored label.

As described in more detail in WO 97/10365, the label may be added
to the target (sample) nucleic acid(s) prior to, or after the hybridization. "Direct labels" are
detectable labels that are directly attached to or incorporated into the target (sample)
nucleic acid prior to hybridization. In contrast, "indirect labels" are joined to the
hybrid duplex after hybridization. Often, the indirect label is attached to a binding
moiety that has been attached to the target nucleic acid prior to the hybridization.
Thus, for example, the target nucleic acid may be biotinylated before the
hybridization. After hybridization, an avidin-conjugated fluorophore will bind
the biotin bearing hybrid duplexes providing a label that is easily detected. For a
detailed review of methods of labeling nucleic acids and detecting labeled hybridized
nucleic acids see LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR
BIOLOGY, Vol. 24: Hybridization with Nucleic Acid Probes, P. Tijssen, ed. Elsevier,
N.Y. (1993).

The nucleic acid sample also may be modified prior to hybridization to
the high density probe array in order to reduce sample complexity, thereby decreasing
background signal and improving sensitivity of the measurement, using the methods
disclosed in WO 97/10365.

Results from the chip assay are typically analyzed using a computer
software program. See, for example, EP 0 717 113 A2 and WO 95/20681. The
hybridization data is read into the program, which calculates the expression level of
the targeted gene(s). This figure is compared against existing data sets of gene
expression levels for diseased and healthy individuals. A correlation between the
obtained data and that of a set of diseased individuals indicates the onset of a disease
in the subject patient.

Also within the scope of this application is a database useful for the
detection of neoplastic lung tissue comprising one or more of the sequences identified
herein as Sequence ID Nos. 1 through 4 or parts of the sequences identified in Table
2, below.

These polynucleotide sequences are stored in a digital storage medium
such that a data processing system for standardized representation of the genes that
identify a lung cancer cell is compiled. The data processing system is useful to
analyze gene expression between two cells by first selecting a cell suspected of being
of a neoplastic phenotype or genotype and then isolating polynucleotides from the
cell. The isolated polynucleotides are sequenced. The sequences from the sample are
compared with the sequence(s) present in the database using homology search
techniques described above. Greater than 90%, more preferably greater than 95% and
even more preferably, greater than or equal to 97% sequence identity between the test
sequence and at least one sequence identified by SEQ ID NO. 1 through 40 or its
complement is a positive indication that the polynucleotide has been isolated from a
lung cancer cell as defined above.

Expression of the proto-oncogenes PGP9.5, p67, 8-oxo-dGTPase and
b-myb can also be determined by examining the protein product. Determining
the protein level involves (a) providing a biological sample containing polypeptides;
and (b) measuring the amount of any immunospecific binding that occurs between an
antibody reactive to PGP9.5, p67, 8-oxo-dGTPase or b-myb proteins and a component
in the sample, in which the amount of immunospecific binding indicates the level of
the proto-oncogene proteins.

Antibodies that specifically recognize and bind to the protein products
of these proto-oncogenes are required for immunoassays. These may be purchased
from commercial vendors or generated and screened using methods well known in the
art. See Harlow and Lane (1988) supra, and Sambrook et al. (1989) supra.

A variety of techniques are available in the art for protein analysis.
They include but are not limited to radioimmunoassays, ELISA (enzyme linked
immunosorbent assays), "sandwich" immunoassays, immunoradiometric
assays, in situ immunoassays (using e.g., colloidal gold, enzyme or radioisotope
labels), western blot analysis, immunoprecipitation assays, immunofluorescent
assays, and PAGE-SDS.

In diagnosing malignancy, hyperplasia or metaplasia characterized by
a differential expression of proto-oncogenes, one typically conducts a comparative
analysis of the subject and appropriate controls. Preferably, a diagnostic test includes
a control sample derived from a subject (hereinafter "positive control"), that exhibits a
detectable increase in proto-oncogene expression, preferably at a level of at least 2.5
fold, and clinical characteristics of the malignancy or metaplasia of interest. More
preferably, a diagnosis also includes a control sample derived from a subject
(hereinafter "negative control"), that lacks the clinical characteristics of the neoplastic
state and whose expression level of the gene in question is within a normal range. A
positive correlation between the subject and the positive control with respect to the
identified alterations indicates the presence of or a predisposition to said disease. A
lack of correlation between the subject and the negative control confirms the
diagnosis. In a preferred embodiment, the method is used for diagnosing lung cancer,
preferably non-small cell lung cancer, on the basis of an increase in PGP9.5,
8-oxo-dGTPase or b-myb mRNA level or protein level.

There are various methods available in the art for quantifying mRNA
or protein level from a cell sample and indeed, any method that can quantify these
levels is encompassed by this invention. For example, determination of the mRNA
level of the aforementioned proto-oncogenes may involve, in one aspect, measuring
the amount of mRNA in a mRNA sample isolated from the lung cell by hybridization
or quantitative amplification using at least one oligonucleotide probe that is
complementary to the mRNA. Determination of the aforementioned proto-oncogene
products requires measuring the amount of immunospecific binding that occurs
between an antibody reactive to the gene product of the proto-oncogene and a
component in the sample. To detect
and quantify the immunospecific binding, or signals generated during hybridization or
amplification procedures, digital image analysis systems including but not limited to
those that detect radioactivity of the probes or chemiluminescence can be employed.

The present invention also provides a screen for various agents and
methods for reversing the neoplastic condition of the cells or selectively inhibiting
growth or proliferation of the cells described above. In one aspect, the screen assays
for agents which are useful for the treatment of malignancy, hyperplasia or metaplasia
characterized by overexpression of the proto-oncogenes.

Thus, to practice the method in vitro, suitable cell cultures or tissue
cultures are first provided. The cell can be a cultured cell or a genetically modified
cell which overexpresses a proto-oncogene associated with a neoplastic lung cell such
as b-myb, p67, 8-oxo-dGTPase or PGP9.5. Alternatively, the cells can be from a
tissue biopsy. The cells are cultured under conditions (temperature, growth or culture
medium and gas (CO2)) and for an appropriate amount of time to attain exponential
proliferation without density dependent constraints. It also is desirable to maintain an
additional separate cell culture, one which does not receive the agent being tested, as a
control.

As is apparent to one of skill in the art, suitable cells may be cultured
in microtiter plates and several agents may be assayed at the same time by noting
genotypic changes, phenotypic changes and/or cell death.

When the agent is a composition other than a DNA or RNA nucleic
acid molecule, the agent may be added directly to the cell culture or
added to culture medium for addition. As is apparent to those skilled in the art, an
"effective" amount must be added which can be empirically determined.

The screen involves contacting the agent with a test cell characterized
by overexpression of these proto-oncogenes and then assaying the cell for the level of
proto-oncogene expression. In some aspects, it may be necessary to determine the
level of proto-oncogene expression prior to the assay. This provides a base line to
compare expression after administration of the agent to the cell culture. In another
embodiment, the test cell is a cultured cell from an established cell line that
constitutively overexpresses these proto-oncogenes. Examples of these cell lines
include, but are not limited to, the cell lines identified below. An agent is a possible
therapeutic agent if the proto-oncogene expression is reduced to a level that is present
in a cell in a normal or non-neoplastic state, or the cell selectively dies, or exhibits
a reduced rate of growth.
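
The three alternative criteria for a possible therapeutic agent (expression reduced to a normal level, selective cell death, or a reduced rate of growth) can be sketched as a simple decision function. The following Python sketch is illustrative only. The record fields are hypothetical, and it assumes that "selectively dies" means the test cells die while non-neoplastic control cells exposed to the same agent do not, and that growth is compared against the untreated control culture.

```python
from dataclasses import dataclass


@dataclass
class ScreenResult:
    """Hypothetical readout for one agent in the screen (arbitrary units)."""
    treated_expression: float    # proto-oncogene expression after the agent
    normal_expression: float     # level seen in a normal, non-neoplastic cell
    treated_growth_rate: float   # growth rate of test cells given the agent
    control_growth_rate: float   # growth rate of the untreated control culture
    test_cells_died: bool        # test cells died after exposure
    normal_cells_died: bool      # non-neoplastic cells died after exposure


def is_possible_therapeutic(result: ScreenResult) -> bool:
    """True if any one of the three criteria stated in the text is met."""
    expression_normalized = result.treated_expression <= result.normal_expression
    selective_death = result.test_cells_died and not result.normal_cells_died
    reduced_growth = result.treated_growth_rate < result.control_growth_rate
    return expression_normalized or selective_death or reduced_growth


# Illustrative values only: expression stays high, but growth is slowed.
example = ScreenResult(
    treated_expression=8.0,
    normal_expression=1.0,
    treated_growth_rate=0.4,
    control_growth_rate=1.0,
    test_cells_died=False,
    normal_cells_died=False,
)
print(is_possible_therapeutic(example))
```

Because the criteria are alternatives, an agent that kills all cells non-selectively and meets neither of the other two criteria is not flagged by this sketch.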

For the purposes of this invention, an "agent" is intended to include,
but not be limited to, a biological or chemical compound such as a simple or complex
organic or inorganic molecule, a peptide, a protein or an oligonucleotide. A vast array
of compounds can be synthesized, for example oligomers, such as oligopeptides and
oligonucleotides, and synthetic organic compounds based on various core structures,
and these are also included in the term "agent". In addition, various natural sources
can provide compounds for screening, such as plant or animal extracts, and the like.
It should be understood, although not always explicitly stated, that the agent is used
alone or in combination with another agent, having the same or different biological
activity as the agents identified by the inventive screen. The agents and methods also
are intended to be combined with other therapies.

As used herein, the term "reversing the neoplastic state of the cell" is
intended to include apoptosis, necrosis or any other means of preventing cell division,
reduced tumorigenicity, loss of pharmaceutical resistance, maturation, differentiation
or reversion of the neoplastic phenotypes as described herein. As noted above, lung
cells having overexpression of a proto-oncogene that results in the neoplastic state are
suitably treated by this method. These cells can be identified by any method known
in the art that allows for the identification of overexpression of the proto-oncogene.
One such method is exemplified below.

When the agent is a nucleic acid, it can be added to the cell cultures by
methods well known in the art, which include, but are not limited to, calcium
phosphate precipitation, microinjection or electroporation. Alternatively or
additionally, the nucleic acid can be incorporated into an expression or insertion
vector for incorporation into the cells. Vectors that contain both a promoter and a
cloning site into which a polynucleotide can be operatively linked are well known in
the art. Such vectors are capable of transcribing RNA in vitro or in vivo, and are
commercially available from sources such as Stratagene® (La Jolla, CA) and
Promega Biotech® (Madison, WI). In order to optimize expression and/or in vitro
transcription, it may be necessary to remove, add or alter 5' and/or 3' untranslated
portions of the clones to eliminate extra, potential inappropriate alternative translation
initiation codons or other sequences that may interfere with or reduce expression,
either at the level of transcription or translation. Alternatively, consensus ribosome
binding sites can be inserted immediately 5' of the start codon to enhance expression.
Examples of vectors are viruses, such as baculovirus and retrovirus, bacteriophage,
adenovirus, adenoassociated virus, cosmid, plasmid, fungal vectors and other
recombination vehicles typically used in the art which have been described for
expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene
therapy as well as for simple protein expression.

Among these are several non-viral vectors, including DNA/liposome
complexes, and targeted viral protein DNA complexes. To enhance delivery to a cell,
the nucleic acid or proteins of this invention can be conjugated to antibodies or
binding fragments thereof which bind cell surface antigens. Liposomes that also
comprise a targeting antibody or fragment thereof can be used in the methods of this
invention. This invention also provides the targeting complexes for use in the
methods disclosed herein.

Polynucleotides are inserted into vector genomes using methods well
known in the art. For example, insert and vector DNA can be contacted, under
suitable conditions, with a restriction enzyme to create complementary ends on each
molecule that can pair with each other and be joined together with a ligase.
Alternatively, synthetic nucleic acid linkers can be ligated to the termini of restricted
polynucleotide. These synthetic linkers contain nucleic acid sequences that
correspond to a particular restriction site in the vector DNA. Additionally, an
oligonucleotide containing a termination codon and an appropriate restriction site can
be ligated for insertion into a vector containing, for example, some or all of the
following: a selectable marker gene, such as the neomycin gene for selection of stable
or transient transfectants in mammalian cells; enhancer/promoter sequences from the
immediate early gene of human CMV for high levels of transcription; transcription
termination and RNA processing signals from SV40 for mRNA stability; SV40
polyoma origins of replication and ColE1 for proper episomal replication; versatile
multiple cloning sites; and T7 and SP6 RNA promoters for in vitro transcription of 
sense and antisense RNA. Other means are well known and available in the art. 

One can determine if the object of the method, i.e., reversal of the
neoplastic state of the cell, has been achieved by assaying for a reduction of cell division,
differentiation of the cell or a reduction in proto-oncogene
overexpression. Cellular differentiation can be monitored by histological methods or
by monitoring for the presence or loss of certain cell surface markers, which may be
associated with an undifferentiated phenotype, e.g., CD34 on primitive
hematopoietic stem cells.

Kits containing the agents and instructions necessary to perform the 
screen and in vitro method as described herein also are claimed. 

When the subject is an animal such as a rat or mouse, the method
provides a convenient animal model system which can be used prior to clinical testing
of the therapeutic agent. In this system, a candidate agent is a potential drug if
proto-oncogene expression is reduced in a neoplastic lung cell or if symptoms associated or
correlated to the presence of cells containing proto-oncogene overexpression are
ameliorated, each as compared to an untreated animal having the pathological cells. It
also can be useful to have a separate negative control group of cells or animals which
are healthy and not treated, which provides a basis for comparison.

The agents of this invention and the above noted compounds and
their derivatives may be used for the preparation of medicaments for use in the
methods described herein.

In a preferred embodiment, an agent of the invention is administered to
treat lung cancer. In a further preferred embodiment, an agent of the invention is
administered to treat non-small cell lung cancer. Therapeutics of the invention can
also be used to prevent progression from a pre-neoplastic or non-malignant state into
a neoplastic or a malignant state.

Various delivery systems are known and can be used to administer a 
therapeutic agent of the invention, e.g., encapsulation in liposomes, microparticles, 
microcapsules, expression by recombinant cells, receptor-mediated endocytosis (see, 
e.g., Wu and Wu, (1987), J. Biol. Chem. 262:4429-4432), construction of a 
therapeutic nucleic acid as part of a retroviral or other vector, etc. Methods of 
delivery include but are not limited to intra-arterial, intramuscular, intravenous, 
intranasal, and oral routes. In a specific embodiment, it may be desirable to 
administer the pharmaceutical compositions of the invention locally to the area in 
need of treatment; this may be achieved by, for example, and not by way of limitation, 
local infusion during surgery, by injection, or by means of a catheter. 

The agents identified herein as effective for their intended purpose can 
be administered to subjects or individuals susceptible to or at risk of developing a 
disease correlated to the overexpression of these proto-oncogenes. When the agent is 
administered to a subject such as a mouse, a rat or a human patient, the agent can be 
added to a pharmaceutically acceptable carrier and systemically or topically 
administered to the subject. To determine patients that can be beneficially treated, a 
tumor sample is removed from the patient and the cells are assayed for the 
overexpression of the proto-oncogene. Therapeutic amounts can be empirically 
determined and will vary with the pathology being treated, the subject being treated 
and the efficacy and toxicity of the agent. When delivered to an animal, the method is 
useful to further confirm efficacy of the agent. As an example of an animal model, 
groups of nude mice (Balb/c NCR nu/nu female, Simonsen, Gilroy, CA) are each 
subcutaneously inoculated with about 10 to about 19 hyperproliferative, cancer or 
target cells as defined herein. When the tumor is established, the agent is 
administered, for example, by subcutaneous injection around the tumor. Tumor 
measurements to determine reduction of tumor size are made in two dimensions using 
vernier calipers twice a week. Other animal models may also be employed as 
appropriate. 
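
As an illustration only, the twice-weekly caliper readings described above could be reduced to a single size index and a percent reduction as in the following Python sketch. The use of the product of the two measured dimensions as the size index, the function names and the example readings are assumptions of this sketch and are not specified in this disclosure.

```python
def tumor_area(length_mm, width_mm):
    """Size index from two caliper dimensions (product of the two readings).

    Using the simple product is an assumption of this sketch; the text only
    states that tumors are measured in two dimensions.
    """
    return length_mm * width_mm


def percent_reduction(baseline, current):
    """Percent decrease of the size index relative to the baseline reading."""
    return 100 * (baseline - current) / baseline


# Hypothetical readings (mm) for one animal, for illustration only.
baseline = tumor_area(10, 8)    # reading taken when treatment begins
week_two = tumor_area(8, 7.5)   # later reading
print(percent_reduction(baseline, week_two))
```

A treated group would then be compared with an untreated control group using the same index.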

Administration in vivo can be effected in one dose, continuously or 
intermittently throughout the course of treatment. Methods of determining the most 
effective means and dosage of administration are well known to those of skill in the 
art and will vary with the composition used for therapy, the purpose of the therapy, 
the target cell being treated, and the subject being treated. Single or multiple 
administrations can be carried out with the dose level and pattern being selected by 
the treating physician. Suitable dosage formulations and methods of administering 
the agents can be found below. 

The agents and compositions of the present invention can be used in 
the manufacture of medicaments and for the treatment of humans and other animals 
by administration in accordance with conventional procedures, such as an active 
ingredient in pharmaceutical compositions. 

The pharmaceutical compositions can be administered orally, 
intranasally, parenterally or by inhalation therapy, and may take the form of tablets, 
lozenges, granules, capsules, pills, ampoules, suppositories or aerosol form. They 
may also take the form of suspensions, solutions and emulsions of the active 
ingredient in aqueous or nonaqueous diluents, syrups, granulates or powders. In 
addition to an agent of the present invention, the pharmaceutical compositions can 
also contain other pharmaceutically active compounds or a plurality of compounds of 
the invention. 

More particularly, an agent of the present invention, also referred to 
herein as the active ingredient, may be administered for therapy by any suitable route 
including oral, rectal, nasal, topical (including transdermal, aerosol, buccal and 
sublingual), vaginal, parenteral (including subcutaneous, intramuscular, intravenous and 
intradermal) and pulmonary. It will also be appreciated that the preferred route will 
vary with the condition and age of the recipient, and the disease being treated. 

Ideally, the agent should be administered to achieve peak 
concentrations of the active compound at sites of disease. This may be achieved, for 
example, by the intravenous injection of the agent, optionally in saline, or by oral 
administration, for example, as a tablet, capsule or syrup containing the active 
ingredient. Desirable blood levels of the agent may be maintained by a continuous 
infusion to provide a therapeutic amount of the active ingredient within disease tissue. 
The use of operative combinations is contemplated to provide therapeutic 
combinations requiring a lower total dosage of each component antiviral agent than 
may be required when each individual therapeutic compound or drug is used alone, 
thereby reducing adverse effects. 

While it is possible for the agent to be administered alone, it is 
preferable to present it as a pharmaceutical formulation comprising at least one active 
ingredient, as defined above, together with one or more pharmaceutically acceptable 
carriers therefor and optionally other therapeutic agents. Each carrier must 
be "acceptable" in the sense of being compatible with the other ingredients of the 
formulation and not injurious to the patient. 

Formulations include those suitable for oral, rectal, nasal, topical 
(including transdermal, buccal and sublingual), vaginal, parenteral (including 
subcutaneous, intramuscular, intravenous and intradermal) and pulmonary 
administration. The formulations may conveniently be presented in unit dosage form 
and may be prepared by any methods well known in the art of pharmacy. Such 
methods include the step of bringing into association the active ingredient with the 
carrier which constitutes one or more accessory ingredients. In general, the 
formulations are prepared by uniformly and intimately bringing into association the 
active ingredient with liquid carriers or finely divided solid carriers or both, and then 
if necessary shaping the product. 

Formulations of the present invention suitable for oral administration 
may be presented as discrete units such as capsules, cachets or tablets, each 
containing a predetermined amount of the active ingredient; as a powder or granules; 
as a solution or suspension in an aqueous or non-aqueous liquid; or as an oil-in-water 
liquid emulsion or a water-in-oil liquid emulsion. The active ingredient may also be 
presented as a bolus, electuary or paste. 

A tablet may be made by compression or molding, optionally with one 
or more accessory ingredients. Compressed tablets may be prepared by compressing 
in a suitable machine the active ingredient in a free-flowing form such as a powder or 
granules, optionally mixed with a binder (e.g., povidone, gelatin, 
hydroxypropylmethyl cellulose), lubricant, inert diluent, preservative, disintegrant 
(e.g., sodium starch glycolate, cross-linked povidone, cross-linked sodium 
carboxymethyl cellulose), surface-active or dispersing agent. Molded tablets may be 
made by molding in a suitable machine a mixture of the powdered compound 
moistened with an inert liquid diluent. The tablets may optionally be coated or scored 
and may be formulated so as to provide slow or controlled release of the active 
ingredient therein using, for example, hydroxypropylmethyl cellulose in varying 
proportions to provide the desired release profile. Tablets may optionally be provided 
with an enteric coating, to provide release in parts of the gut other than the stomach. 

Formulations suitable for topical administration in the mouth include 
lozenges comprising the active ingredient in a flavored basis, usually sucrose and 
acacia or tragacanth; pastilles comprising the active ingredient in an inert basis such 
as gelatin and glycerin, or sucrose and acacia; and mouthwashes comprising the active 
ingredient in a suitable liquid carrier. 

Pharmaceutical compositions for topical administration according to 
the present invention may be formulated as an ointment, cream, suspension, lotion, 
powder, solution, paste, gel, spray, aerosol or oil. Alternatively, a formulation may 
comprise a patch or a dressing such as a bandage or adhesive plaster impregnated with 
active ingredients and optionally one or more excipients or diluents. 

If desired, the aqueous phase of the cream base may include, for 
example, at least about 30% w/w of a polyhydric alcohol, i.e., an alcohol having two 
or more hydroxyl groups such as propylene glycol, butane-1,3-diol, mannitol, 
sorbitol, glycerol and polyethylene glycol and mixtures thereof. The topical 
formulations may desirably include a compound which enhances absorption or 
penetration of the agent through the skin or other affected areas. Examples of such 
dermal penetration enhancers include dimethylsulfoxide and related analogues. 

The oily phase of the emulsions of this invention may be constituted 
from known ingredients in a known manner. While this phase may comprise merely 
an emulsifier (otherwise known as an emulgent), it desirably comprises a mixture of 
at least one emulsifier with a fat or an oil or with both a fat and an oil. Preferably, a 
hydrophilic emulsifier is included together with a lipophilic emulsifier which acts as a 
stabilizer. It is also preferred to include both an oil and a fat. Together, the 
emulsifier(s) with or without stabilizer(s) make up the so-called emulsifying wax, and 
the wax together with the oil and/or fat make up the so-called emulsifying ointment 
base which forms the oily dispersed phase of the cream formulations. 

Emulgents and emulsion stabilizers suitable for use in the formulation 
of the present invention include TWEEN® 60, SPAN® 80, cetostearyl 
alcohol, myristyl alcohol, glyceryl monostearate and sodium lauryl sulphate. 

The choice of suitable oils or fats for the formulation is based on 
achieving the desired cosmetic properties, since the solubility of the active compound 
in most oils likely to be used in pharmaceutical emulsion formulations is very low. 
Thus the cream should preferably be a non-greasy, non-staining and washable product 
with suitable consistency to avoid leakage from tubes or other containers. Straight or 
branched chain, mono- or dibasic alkyl esters such as di-isoadipate, isocetyl stearate, 
propylene glycol diester of coconut fatty acids, isopropyl myristate, decyl oleate, 
isopropyl palmitate, butyl stearate, 2-ethylhexyl palmitate or a blend of branched 
chain esters known as Crodamol CAP may be used, the last three being preferred 
esters. These may be used alone or in combination depending on the properties 
required. Alternatively, high melting point lipids such as white soft paraffin and/or 
liquid paraffin or other mineral oils can be used. 

Formulations suitable for topical administration to the eye also include 
eye drops wherein the active ingredient is dissolved or suspended in a suitable carrier, 
especially an aqueous solvent for the agent. 

Formulations for rectal administration may be presented as a 
suppository with a suitable base comprising, for example, cocoa butter or a salicylate. 

Formulations suitable for vaginal administration may be presented as 
pessaries, tampons, creams, gels, pastes, foams or spray formulations containing in 
addition to the agent, such carriers as are known in the art to be appropriate. 

Formulations suitable for nasal administration, wherein the carrier is a 
solid, include a coarse powder having a particle size, for example, in the range of 
about 20 to about 500 microns which is administered in the manner in which snuff is 
taken, i.e., by rapid inhalation through the nasal passage from a container of the 
powder held close up to the nose. Suitable formulations wherein the carrier is a liquid 
for administration as, for example, nasal spray, nasal drops, or by aerosol 
administration by nebulizer, include aqueous or oily solutions of the agent. 

Formulations suitable for parenteral administration include aqueous 
and non-aqueous isotonic sterile injection solutions which may contain anti-oxidants, 
buffers, bacteriostats and solutes which render the formulation isotonic with the blood 
of the intended recipient; and aqueous and non-aqueous sterile suspensions which 
may include suspending agents and thickening agents, and liposomes or other 
microparticulate systems which are designed to target the compound to blood 
components or one or more organs. The formulations may be presented in unit-dose 
or multi-dose sealed containers, for example, ampoules and vials, and may be stored 
in a freeze-dried (lyophilized) condition requiring only the addition of the sterile 
liquid carrier, for example water for injections, immediately prior to use. 
Extemporaneous injection solutions and suspensions may be prepared from sterile 
powders, granules and tablets of the kind previously described. 

Preferred unit dosage formulations are those containing a daily dose or 
unit, daily subdose, as herein above-recited, or an appropriate fraction thereof, of an 
agent. 

It should be understood that in addition to the ingredients particularly 
mentioned above, the formulations of this invention may include other agents 
conventional in the art having regard to the type of formulation in question, for 
example, those suitable for oral administration may include such further agents as 
sweeteners, thickeners and flavoring agents. It also is intended that the agents, 
compositions and methods of this invention be combined with other suitable 
compositions and therapies. 

In another aspect, the proto-oncogenes provided herein can be used to 
generate transgenic animal models. In recent years, geneticists have succeeded in 
creating transgenic animals, for example mice, by manipulating the genes of 
developing embryos and introducing foreign genes into these embryos. Once these 
genes have integrated into the genome of the recipient embryo, the resulting embryos 
or adult animals can be analyzed to determine the function of the gene. The mutant 
animals are produced to understand the function of known genes in vivo and to create 
animal models of human diseases (see, e.g., Chisaka et al. (1992) 355:516-520; 
Joyner et al. (1992) in POSTIMPLANTATION DEVELOPMENT IN THE MOUSE 
(Chadwick and Marsh, eds., John Wiley & Sons, United Kingdom) pp. 277-297; Dorm 
et al. (1992) Nature 359:211-215). 

U.S. Patent Nos. 5,464,764 and 5,487,992 describe one type of 
transgenic animal in which the gene of interest is deleted or mutated sufficiently to 
disrupt its function. (See also U.S. Patent Nos. 5,631,153 and 5,627,059). These 
"knock-out" animals, made by taking advantage of the phenomenon of homologous 
recombination, can be used to study the function of a particular gene sequence in vivo. 
The polynucleotide sequences described herein are useful in preparing animal models 
of lung cancer. 

The following examples are intended to illustrate, but not limit, this 
invention. 

EXAMPLES 

Methods 

SAGE Analysis of cDNAs Derived from Tumor or Normal Lungs 

Primary squamous cell lung cancers containing over 95% neoplastic 
components from two unrelated patients were selected for SAGE analysis. Patient A 
was 58 years old and diagnosed with moderately differentiated cancer of the lower 
right lobe of the lung at the time of surgery. Patient B was 68 years old and diagnosed 
with poorly differentiated cancer of the lower right lobe. Normal small airway 
epithelial cells obtained from two age- and gender-matched independent individuals 
were used as the negative controls. All cancer cell lines were obtained from 
the American Type Culture Collection and propagated according to the instructions 
provided. 

A systematic analysis of transcripts present in non-small cell lung 
cancer (NSCLC) was performed by Serial Analysis of Gene Expression ("SAGE") 
(U.S. Patent No. 5,695,937). SAGE analysis involves identifying nucleotide 
sequences expressed in the antigen-expressing cells. Briefly, SAGE analysis begins 
with providing complementary deoxyribonucleic acid (cDNA) from (1) the neoplastic 
population and (2) normal cells. Both cDNAs can be linked to primer sites. 
Sequence tags are then created, for example, using the appropriate primers to amplify 
the DNA. By measuring the differences in these tags between the two cell types, 
sequences which are aberrantly expressed in the neoplastic cell population can be 
identified. 

The SAGE libraries were constructed essentially as described in 
Velculescu et al. (1995) Science 270:484-487. PolyA RNAs isolated from lung 
tumors of patients A and B, and from normal small airway epithelial cells, were 
converted to double-stranded cDNA. The cDNA was then cleaved with the anchoring 
enzyme NlaIII and divided into two pools. Linkers containing recognition sites for the 
tagging enzyme BsmFI were ligated to each pool. After BsmFI restriction, 
SAGE tag overhangs were filled in with Klenow, and tags from the two pools were 
combined and ligated to each other. The ligation product was diluted and then 
amplified by PCR. The resulting PCR product was then analyzed by polyacrylamide 
gel electrophoresis (PAGE), and the PCR product containing two tags ligated tail to 
tail (ditag) was excised and then cleaved with NlaIII. After NlaIII restriction, the 
ditags were excised and self-ligated. The concatenated products were separated by 
PAGE and those containing about 500 to 2000 nucleotide base pairs were excised and 
cloned for subsequent sequence analysis. 

The sequence and the occurrence of each of the transcript tags were 
determined using SAGE software, described, for example, in Venter et al. (1996) 
Nature 381:364-366. To identify transcript tags present in each library, the sequences 
of all SAGE tags were stored as a "Tag" file in Microsoft Access®. The 
GENBANK® dbEST and nucleotide databases were also analyzed by the SAGE 
software to identify the corresponding SAGE tags, which were then stored as a 
"Genename" file. The GENBANK® entry for each SAGE tag was obtained by linking the 
tags from the Tag and Genename files using Microsoft Access®. The relative 
occurrence of each tag was determined by comparing the number of tags observed in 
the tumor libraries with that observed in the normal control libraries. The relative 
abundance for each tag was calculated by dividing the number of times the tag was 
observed by the total number of tags identified. 
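
The tag arithmetic described above can be sketched in Python as follows. This is a minimal illustration, not the SAGE software itself: the tag names, the counts, the function names and the floor of one count used to avoid division by zero are all assumptions made for the example.

```python
def relative_abundance(tag_counts):
    """Return each tag's count divided by the total number of tags in the library."""
    total = sum(tag_counts.values())
    return {tag: count / total for tag, count in tag_counts.items()}


def fold_difference(tumor_counts, normal_counts, floor=1):
    """Return the tumor/normal ratio of relative abundance for every tag.

    A tag absent from a library is given `floor` counts (an assumption of
    this sketch) so that the ratio is always defined.
    """
    tumor_total = sum(tumor_counts.values())
    normal_total = sum(normal_counts.values())
    folds = {}
    for tag in set(tumor_counts) | set(normal_counts):
        tumor_freq = max(tumor_counts.get(tag, 0), floor) / tumor_total
        normal_freq = max(normal_counts.get(tag, 0), floor) / normal_total
        folds[tag] = tumor_freq / normal_freq
    return folds


# Hypothetical tag counts, for illustration only.
tumor = {"TAG_A": 30, "TAG_B": 10, "TAG_C": 60}
normal = {"TAG_A": 2, "TAG_B": 10, "TAG_C": 88}

folds = fold_difference(tumor, normal)
# Tags overexpressed at least 10-fold in the tumor library.
overexpressed = sorted(tag for tag, fold in folds.items() if fold >= 10)
print(overexpressed)
```

Normalizing each count by its library total before taking the ratio keeps the comparison meaningful when the tumor and normal libraries contain different numbers of tags.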

Northern Blot Analysis 

To establish the clinical significance of the differential expression of 
the SAGE transcripts in lung tumorigenesis, Northern blot analysis was performed 
on total RNA from normal tissues and a panel of NSCLC cell lines using 
oligonucleotide probes complementary to the transcripts of PGP9.5, b-myb or 
8-oxo-dGTPase. Procedures for carrying out Northern blot analysis are detailed in 
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, supra. In primary lung 
tumors, a low 8-oxo-dGTPase mRNA level was detected in 3/8 cases examined. 
However, PGP9.5 mRNA was overexpressed in 10/18 cases and b-myb was 
overexpressed in 15/18 cases. Whereas transcripts of these genes were not detected in 
normal lung cell samples, they were detected in more than 90% of the 
examined lung tumor cell lines (see Table 2). 

Comparative Northern blot analyses of the normal cell lines and 
NSCLC cell lines (see Figures 2 and 3) confirmed an increase in PGP9.5, 
8-oxo-dGTPase or b-myb mRNA levels in NSCLC cells, thus establishing an involvement 
of these two genes in lung tumorigenesis, and particularly in non-small 
cell lung cancer. 

Western Blot Analysis 

Cell lysates from 5x1 cells of each cell line described herein were 
electrophoresed on a 4-20% SDS gradient gel and transferred to a PVDF membrane 
(MSI). After blocking non-specific sites by incubating in PBS+5% non-fat dry milk 
(NFDM), the membrane was incubated with anti-PGP9.5 antibody (Biogenesis, UK) 
at 1:400 dilution. An ECL kit (Amersham®) was used to visualize the antibody binding 
to PGP9.5 protein (see Fig. 3B bottom panel). 

Results 

Four independent SAGE libraries were constructed from messenger 
RNAs using two squamous cell lung cancers and two normal lung small airway 
epithelial cell cultures as described in Madden et al. (1997) Oncogene 15:1079-1085. 
A total of 2,000-4,000 clones were sequenced to identify more than 50,000 transcript 
tags from each library (Table 1). The sequences of over 50,000 tags that represent 
about 15,000 unique transcripts in each library were analyzed in order to generate a 
comprehensive profile of gene expression patterns in lung cancer. In total, 226,876 
tags were sequenced, jointly representing 43,254 unique transcripts. 
GENBANK® analysis suggested that about 40% of the SAGE tags had at least one 
match in the database. As summarized in Table 1, an examination of the SAGE tags 
identified from each sample indicated that the occurrence of tags within each tissue 
type was highly consistent because only 15 and 17 tags were differentially expressed 
by more than 10-fold when the two normal control libraries were compared with each 
other. Similarly, 36 and 39 tags, respectively, were expressed differentially by more 
than 10-fold between the two tumors. Therefore, the SAGE tags obtained from the 
two normal controls and the two tumors were combined to determine the total number 
of occurrences for each tag. 

A comparison of the number of tags present in the tumor and normal 
libraries indicated that a majority of the transcripts were expressed at similar levels 
(Fig. 1). However, 142 transcripts were overexpressed 10-fold or more in the tumor 
and a similar number, 175, were overexpressed in the normal control. Table 1 
summarizes the comparative SAGE analyses of cDNA clones derived from the lung 
cancers of two individuals and the lungs of two normal individuals. 



Table 1. Summary of SAGE analysis. 

                   Tumor A   Tumor B   Normal Lung 1   Normal Lung 2   Overall 

Total clones         2,259     2,186           3,759           4,046    12,250 
Total tags          56,817    51,901          58,273          59,885   226,876 
Unique tags         17,535    16,443          15,070          15,667    43,254 
GenBank match^       8,867     8,445           7,596           7,801    18,553 
Tags >10X*              36        39              17              25   142/175 

^ The number of tags that matched an entry in GENBANK®. 

* Number of tags differentially expressed at the indicated fold 
when compared to each other. The overall values (tumor/normal) were obtained by 
comparing the tags identified in the two tumors (A and B) with those in the two 
normal samples (1 and 2). 

All tags were searched against GENBANK® to identify 
the corresponding gene transcripts or EST clones. The highest level of relative gene 
expression in the normal control was about 150-fold (data not shown), whereas the 
highest level of expression in the tumor was 57-fold compared to the normal control. 

In order to identify genes that were consistently overexpressed in most 
lung cancers but not present in the normal lung, each candidate gene was screened by 
Northern blot analyses using total RNA from normal lung, liver, small intestine, and 9 
lung tumor cell lines (see Figure 2). 

The transcripts corresponding to 15 significantly overexpressed tags with authentic 
EST or GENBANK® matches have been examined. Among the tested 
genes, 10 were excluded from further analysis because they were either commonly 
expressed (6 genes) or virtually absent (4 genes) in most tissues and cell lines (see 
Table 2). Interferon-a-inducible gene was detected in the normal lung and therefore 
was not further analyzed. The transcripts for four genes, each encoding 8-oxo-dGTPase 
(human MutT), b-myb, p67 or PGP9.5, were not detectable in the normal 
lung samples but were easily detectable in most of the lung cancer cell lines tested 
(Figs. 2 and 3B). The abundance of these four genes ranged from 0.012% to 0.018% 
of the total tags identified in the tumor or about 36-54 transcripts/cell, assuming that 
there are approximately 300,000 transcripts in a cell. 
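
The conversion from tag abundance to transcripts per cell stated above is a simple proportion, sketched here in Python. The figure of 300,000 transcripts per cell is the assumption given in the text; the function and constant names are illustrative only.

```python
TRANSCRIPTS_PER_CELL = 300_000  # approximate value assumed in the text


def transcripts_per_cell(abundance_percent, total=TRANSCRIPTS_PER_CELL):
    """Convert a tag abundance (percent of all tags) to transcripts per cell."""
    return abundance_percent / 100 * total


low = round(transcripts_per_cell(0.012))   # lower end of the reported range
high = round(transcripts_per_cell(0.018))  # upper end of the reported range
print(low, high)  # 36 54
```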






In Table 2, the values on the left are the sums of SAGE tags identified in tumor 
(T) or normal (N) libraries. The results of the Northern analysis are shown at the right 
of the table. All lung tumor cell lines (H82, H1299, A549, SKEM, SW900, H125, 
H157, H520, H2170, and L132) were obtained from ATCC and propagated according 
to the instructions. Except for L132, all other cell lines were derived from lung 
cancers. 

To determine the biological relevance of the overrepresentation 
of these transcripts, the presence of these transcripts in primary 
tumors was examined. It was found that b-myb transcripts were overexpressed in 
15/18 primary tumors (see Fig. 3A upper panel that depicts results from selected 
primary tumor samples) and the PGP9.5 mRNAs were overexpressed in 10/18 
primary tumors. 8-oxo-dGTPase was detected in 3/8 samples (see Fig. 3A lower panel 
and also Table 3 that summarizes the expression pattern of the human PGP9.5 gene 
transcript in all examined primary lung cancers). Interestingly, all four lung cancer 
cases with lymph node metastases expressed the PGP9.5 message, suggesting a possible 
correlation between PGP9.5 overexpression and clinical manifestation of the disease. 

Table 3. Expression of PGP9.5 gene in primary lung cancer 

Primary lung   Northern blot   Histology            Lymph node 
tumor          analysis                             metastasis 

L002           +               adeno 
L004                           adeno 
L006                           adeno 
L008           +               adeno 
L010                           squamous 
L012           +               adeno + small cell 
L014                           bronchioalveolar 
L016                           adeno 
1T             +               squamous             + 
3T             +               squamous 
84                             squamous 
85             +               squamous 
86             +               squamous 
87                             squamous 
88             +               squamous 
89                             squamous 
90                             squamous 
91             +               squamous 

Since lung cancers often have features of neuronal differentiation 
(Mackay, et al. (1990) Tumors of the Lung in MAJOR PROBLEMS IN PATHOLOGY, 
Volume 24, W.B. Saunders Co. Philadelphia), and the PGP9.5 gene was originally 
identified as a neuro-specific ubiquitin hydrolase (Wilkinson, et al. (1989) Science 
246:670-673), it was next determined whether PGP9.5 expression was associated with 
this phenotype. A panel of established lung cancer cell lines having defined 
neuroendocrine features based on hASH1 gene status (see Borges, et al. (1997) 
Nature 386:852-855) was used. Although expression of hASH1 is essential for 
neurodifferentiation of the lung and is one of the most reliable neuroendocrine 
markers (Ermisch, et al. (1995) Clin. Neuropath. 3:130-136), PGP9.5 protein was 
abundantly expressed in nearly all lung cancer cell lines independent of 
hASH1 status (Fig. 3B). 

It is also possible that PGP9.5 expression was associated with the high 
rate of proliferation common to all cancers. To address this issue, a panel of 12 head 
and neck cancer and bladder cancer cell lines was tested to determine the tissue 
specificity of PGP9.5 overexpression. Head and neck cancers are invariably derived 
from squamous cell origin and usually have histopathological and genetic 
characteristics similar to those of squamous cell lung cancer. However, only 3 of the 
6 head and neck tumor cell lines had PGP9.5 message and none of the 12 cancers 
expressed the encoded protein (see Table 4). These observations show that the 
overexpression of PGP9.5 is primarily restricted to lung cancers and that the presence 
of PGP9.5 protein is independent of neuronal differentiation but may contribute to 
lung tumor development by deubiquitination of target proteins. 

Table 4. PGP9.5 expression and the character of other tumors. 

Cell line    Northern blot    Histology 
             analysis 

FADU         -                head and neck cancer 
022          +                head and neck cancer 
06           +                head and neck cancer 
11           -                head and neck cancer 
12           +                head and neck cancer 
29           -                head and neck cancer 
J82          -                bladder cancer 
SCaBER       -                bladder cancer 
T24i         -                bladder cancer 
S637         -                bladder cancer 
HT1376       -                bladder cancer 
HT1197       -                bladder cancer 



Table 5 shows expression of PGP9.5 and B-myb in human cancers as 
determined by Northern blot analysis. Both PGP9.5 and B-myb were expressed at 
much lower frequencies in primary cancers derived from colon, bladder and kidney. 

Table 5. Expression of PGP9.5 and B-myb in human cancers. 

Tumor Type          PGP9.5 (%)      B-myb (%) 

Lung cell lines     22/23 (96)      23/23 (100) 
Primary Cancer 
  Lung              10/18 (56)      15/18 (83) 
  Colon             0/5 (0)         0/5 (0) 
  Bladder           0/4 (0)         2/4 (50) 
  Kidney            2/7 (29)        1/7 (14) 
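
The percentages in Table 5 follow directly from the positive/total counts. As a hedged illustration, the Python sketch below recomputes them; the dictionary layout and the function name are assumptions of the sketch, while the counts are those listed in Table 5.

```python
# (positive, total) counts by Northern blot analysis, as listed in Table 5.
TABLE5 = {
    "Lung cell lines": {"PGP9.5": (22, 23), "B-myb": (23, 23)},
    "Lung":            {"PGP9.5": (10, 18), "B-myb": (15, 18)},
    "Colon":           {"PGP9.5": (0, 5),   "B-myb": (0, 5)},
    "Bladder":         {"PGP9.5": (0, 4),   "B-myb": (2, 4)},
    "Kidney":          {"PGP9.5": (2, 7),   "B-myb": (1, 7)},
}


def percent_positive(positive, total):
    """Percentage of samples expressing the transcript, to the nearest integer."""
    return round(100 * positive / total)


for tumor_type, genes in TABLE5.items():
    cells = [
        f"{gene} {pos}/{tot} ({percent_positive(pos, tot)})"
        for gene, (pos, tot) in genes.items()
    ]
    print(tumor_type, *cells, sep="  ")
```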



It is to be understood that while the invention has been described in 
conjunction with the above embodiments, the foregoing description and the 
following examples are intended to illustrate and not limit the scope of the invention. 
Other aspects, advantages and modifications within the scope of the invention will be 
apparent to those skilled in the art to which the invention pertains. 